Prepare packages and data

library("tidyverse")
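## Warning: package 'tidyverse' was built under R version 4.3.1
## Warning: package 'readr' was built under R version 4.3.1
## Warning: package 'stringr' was built under R version 4.3.1
## Warning: package 'forcats' was built under R version 4.3.1
## Warning: package 'lubridate' was built under R version 4.3.1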
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library("here")
## Warning: package 'here' was built under R version 4.3.1
## here() starts at D:/JHU/Term 1/Statistical Computing/Sta_com_proj1
library("gapminder")
## Warning: package 'gapminder' was built under R version 4.3.1
library("stringr")
library("plotly")
## Warning: package 'plotly' was built under R version 4.3.1
## 
## Attaching package: 'plotly'
## 
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following object is masked from 'package:graphics':
## 
##     layout
# tests if a directory named "data" exists locally
if (!dir.exists(here("data"))) {
    dir.create(here("data"))
}

# saves data only once (not each time you knit an R Markdown document)
if (!file.exists(here("data", "chocolate.RDS"))) {
    url_csv <- "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-18/chocolate.csv"
    chocolate <- readr::read_csv(url_csv)

    # save the file to RDS objects
    saveRDS(chocolate, file = here("data", "chocolate.RDS"))
}

chocolate <- readRDS(here("data", "chocolate.RDS"))
as_tibble(chocolate)
## # A tibble: 2,530 × 10
##      ref company_manufacturer company_location review_date
##    <dbl> <chr>                <chr>                  <dbl>
##  1  2454 5150                 U.S.A.                  2019
##  2  2458 5150                 U.S.A.                  2019
##  3  2454 5150                 U.S.A.                  2019
##  4  2542 5150                 U.S.A.                  2021
##  5  2546 5150                 U.S.A.                  2021
##  6  2546 5150                 U.S.A.                  2021
##  7  2542 5150                 U.S.A.                  2021
##  8   797 A. Morin             France                  2012
##  9   797 A. Morin             France                  2012
## 10  1011 A. Morin             France                  2013
## # ℹ 2,520 more rows
## # ℹ 6 more variables: country_of_bean_origin <chr>,
## #   specific_bean_origin_or_bar_name <chr>, cocoa_percent <chr>,
## #   ingredients <chr>, most_memorable_characteristics <chr>, rating <dbl>
glimpse(chocolate)
## Rows: 2,530
## Columns: 10
## $ ref                              <dbl> 2454, 2458, 2454, 2542, 2546, 2546, 2…
## $ company_manufacturer             <chr> "5150", "5150", "5150", "5150", "5150…
## $ company_location                 <chr> "U.S.A.", "U.S.A.", "U.S.A.", "U.S.A.…
## $ review_date                      <dbl> 2019, 2019, 2019, 2021, 2021, 2021, 2…
## $ country_of_bean_origin           <chr> "Tanzania", "Dominican Republic", "Ma…
## $ specific_bean_origin_or_bar_name <chr> "Kokoa Kamili, batch 1", "Zorzal, bat…
## $ cocoa_percent                    <chr> "76%", "76%", "76%", "68%", "72%", "8…
## $ ingredients                      <chr> "3- B,S,C", "3- B,S,C", "3- B,S,C", "…
## $ most_memorable_characteristics   <chr> "rich cocoa, fatty, bready", "cocoa, …
## $ rating                           <dbl> 3.25, 3.50, 3.75, 3.00, 3.00, 3.25, 3…

Part 1: Explore data

In this part, use functions from dplyr and ggplot2 to answer the following questions.

  1. Make a histogram of the rating scores to visualize the overall distribution of scores. Change the number of bins from the default to 10, 15, 20, and 25. Pick the one that you think looks the best. Explain what the difference is when you change the number of bins and explain why you picked the one you did.
for (i in c(10, 15, 20, 25, 30)) {
  print(qplot(rating, data = chocolate, bins = i))
}
## Warning: `qplot()` was deprecated in ggplot2 3.4.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

I would pick 25 bins. As the number of bins increases, neighboring rating values are separated into their own bars. With 25 bins, the bars are evenly spaced and the interval is not too large, so I picked this one.
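
To see why the bin count matters, here is a minimal sketch using base R's `cut()` on a small made-up vector. The `ratings` vector and the helper `bin_counts()` are illustrative only and are not part of the assignment. ggplot2 places its bin edges a little differently, so this shows the general effect rather than the exact histograms above.

```r
# Hypothetical ratings on a 0.25-point grid, standing in for chocolate$rating
ratings <- c(2.5, 2.75, 3, 3, 3.25, 3.25, 3.5, 3.5, 3.75, 4)

# Count how many ratings land in each of n equal-width bins
bin_counts <- function(x, n) {
  table(cut(x, breaks = n))
}

bin_counts(ratings, 3)  # few bins: neighboring rating values share a bar
bin_counts(ratings, 7)  # more bins: each distinct rating value gets its own bar
```

With few bins, several rating values are merged into one bar. With more bins, the 0.25-point steps of the rating scale become visible.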

  2. Consider the countries where the beans originated from. How many reviews come from each country of bean origin?
table(chocolate$country_of_bean_origin)
## 
##             Australia                Belize                 Blend 
##                     3                    76                   156 
##               Bolivia                Brazil                 Burma 
##                    80                    78                     1 
##              Cameroon                 China              Colombia 
##                     3                     1                    79 
##                 Congo            Costa Rica                  Cuba 
##                    11                    43                    12 
##    Dominican Republic              DR Congo               Ecuador 
##                   226                     1                   219 
##           El Salvador                  Fiji                 Gabon 
##                     6                    16                     1 
##                 Ghana               Grenada             Guatemala 
##                    41                    19                    62 
##                 Haiti              Honduras                 India 
##                    30                    25                    35 
##             Indonesia           Ivory Coast               Jamaica 
##                    20                     7                    24 
##               Liberia            Madagascar              Malaysia 
##                     3                   177                     8 
##            Martinique                Mexico             Nicaragua 
##                     1                    55                   100 
##               Nigeria                Panama      Papua New Guinea 
##                     3                     9                    50 
##                  Peru           Philippines              Principe 
##                   244                    24                     1 
##           Puerto Rico                 Samoa              Sao Tome 
##                     7                     3                    14 
##   Sao Tome & Principe          Sierra Leone       Solomon Islands 
##                     2                     4                    10 
##             Sri Lanka             St. Lucia St.Vincent-Grenadines 
##                     2                    10                     1 
##              Sulawesi               Sumatra              Suriname 
##                     1                     1                     1 
##                Taiwan              Tanzania              Thailand 
##                     2                    79                     5 
##                Tobago                  Togo              Trinidad 
##                     2                     3                    42 
##                U.S.A.                Uganda               Vanuatu 
##                    33                    19                    13 
##             Venezuela               Vietnam 
##                   253                    73
  3. What is the average rating score from reviews of chocolate bars that have Ecuador as country_of_bean_origin in this dataset? For this same set of reviews, also calculate (1) the total number of reviews and (2) the standard deviation of the rating scores. Your answer should be a new data frame with these three summary statistics in three columns. Label the name of these columns mean, sd, and total.

If you just want to see statistics for Ecuador:

ecu <- filter(chocolate, country_of_bean_origin == "Ecuador")
tibble(
  mean = mean(ecu$rating, na.rm = TRUE),
  sd = sd(ecu$rating, na.rm = TRUE),
  total = nrow(ecu)  # nrow() gives a plain integer; count() would nest a data frame
)
## # A tibble: 1 × 3
##    mean    sd total
##   <dbl> <dbl> <int>
## 1  3.16 0.512   219
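
As a sketch of the same calculation written as a single pipeline, here is a version on a small made-up data frame. The `reviews` tibble is hypothetical and is not the chocolate data.

```r
library(dplyr)

# Hypothetical reviews: three from Ecuador, one from Peru
reviews <- tibble(
  country_of_bean_origin = c("Ecuador", "Ecuador", "Ecuador", "Peru"),
  rating = c(3, 3.5, 4, 2.75)
)

# Filter to one origin, then compute all three statistics in one summarize() call
ecu_summary <- reviews %>%
  filter(country_of_bean_origin == "Ecuador") %>%
  summarize(
    mean = mean(rating, na.rm = TRUE),
    sd = sd(rating, na.rm = TRUE),
    total = n()
  )

ecu_summary
```

Using `n()` inside `summarize()` keeps `total` as an ordinary integer column next to `mean` and `sd`.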

If you want to see these three summary statistics for all countries:

chocolate %>%
  group_by(country_of_bean_origin) %>%
  summarize(
    mean = mean(rating, na.rm = T),
    sd = sd(rating, na.rm = T),
    total = n()
  ) %>%
  print()
## # A tibble: 62 × 4
##    country_of_bean_origin  mean     sd total
##    <chr>                  <dbl>  <dbl> <int>
##  1 Australia               3.25  0.5       3
##  2 Belize                  3.23  0.325    76
##  3 Blend                   3.04  0.638   156
##  4 Bolivia                 3.18  0.436    80
##  5 Brazil                  3.26  0.417    78
##  6 Burma                   3    NA         1
##  7 Cameroon                3.08  0.144     3
##  8 China                   3.5  NA         1
##  9 Colombia                3.20  0.423    79
## 10 Congo                   3.32  0.318    11
## # ℹ 52 more rows
  4. Which company (name) makes the best chocolate (or has the highest ratings on average) with beans from Ecuador?
ecu %>%
  group_by(company_manufacturer) %>%
  summarize(
    mean_rating = mean(rating, na.rm = T)
  ) %>%
  arrange(desc(mean_rating))
## # A tibble: 136 × 2
##    company_manufacturer   mean_rating
##    <chr>                        <dbl>
##  1 Amano                         4   
##  2 Benoit Nihant                 4   
##  3 Beschle (Felchlin)            4   
##  4 Durci                         4   
##  5 Smooth Chocolator, The        4   
##  6 Domori                        3.88
##  7 A. Morin                      3.75
##  8 Cacao Sampaka                 3.75
##  9 Foundry                       3.75
## 10 Goodnow Farms                 3.75
## # ℹ 126 more rows
  5. Calculate the average rating across all countries of bean origin. Which top 3 countries (for bean origin) have the highest ratings on average?
chocolate %>%
  group_by(country_of_bean_origin) %>%
  summarize(
    mean_rating = mean(rating, na.rm = T)
  ) %>%
  arrange(desc(mean_rating))
## # A tibble: 62 × 2
##    country_of_bean_origin mean_rating
##    <chr>                        <dbl>
##  1 Tobago                        3.62
##  2 China                         3.5 
##  3 Sao Tome & Principe           3.5 
##  4 Solomon Islands               3.45
##  5 Congo                         3.32
##  6 Thailand                      3.3 
##  7 Cuba                          3.29
##  8 Vietnam                       3.29
##  9 Papua New Guinea              3.28
## 10 Madagascar                    3.27
## # ℹ 52 more rows

Top 3 countries with the highest ratings on average: Tobago, China, and Sao Tome & Principe

  6. Following up on the previous problem, now remove any countries of bean origin that have fewer than 10 chocolate bar reviews. Now, which top 3 countries have the highest ratings on average?
chocolate %>%
  group_by(country_of_bean_origin) %>%
  summarize(
    mean_rating = mean(rating, na.rm = T),
    review_number = n()
  ) %>%
  filter(review_number >= 10) %>%
  arrange(desc(mean_rating))
## # A tibble: 35 × 3
##    country_of_bean_origin mean_rating review_number
##    <chr>                        <dbl>         <int>
##  1 Solomon Islands               3.45            10
##  2 Congo                         3.32            11
##  3 Cuba                          3.29            12
##  4 Vietnam                       3.29            73
##  5 Papua New Guinea              3.28            50
##  6 Madagascar                    3.27           177
##  7 Haiti                         3.27            30
##  8 Brazil                        3.26            78
##  9 Guatemala                     3.26            62
## 10 Nicaragua                     3.26           100
## # ℹ 25 more rows

Top 3 countries with the highest ratings on average: Solomon Islands, Congo, and Cuba
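
The ranking changes because a mean based on one or two reviews is unstable. A small sketch on made-up data shows the effect: the `reviews` tibble and the `min_reviews` threshold of 2 are illustrative assumptions, while the assignment itself uses 10.

```r
library(dplyr)

# Hypothetical reviews: origin "A" has a single high rating, origin "B" has three
reviews <- tibble(
  country_of_bean_origin = c("A", "B", "B", "B"),
  rating = c(4, 3, 3.25, 3.5)
)

min_reviews <- 2  # illustrative threshold; the assignment uses 10

ranked <- reviews %>%
  group_by(country_of_bean_origin) %>%
  summarize(
    mean_rating = mean(rating, na.rm = TRUE),
    review_number = n()
  ) %>%
  filter(review_number >= min_reviews) %>%  # drop origins with too few reviews
  arrange(desc(mean_rating))

ranked
```

Without the filter, origin "A" would rank first on the strength of one review. With it, only origins with enough reviews are compared.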

  7. For this last part, let’s explore the relationship between percent chocolate and ratings.

Use the functions in dplyr, tidyr, and lubridate to perform the following steps to the chocolate dataset:

  1. Identify the countries of bean origin with at least 50 reviews. Remove reviews from countries that are not in this list.
chocolate %>%
  group_by(country_of_bean_origin) %>%
  summarize(
    review_number = n()
  ) -> rev_num

chocolate2 <- left_join(chocolate, rev_num, by = "country_of_bean_origin")
chocolate2 <- filter(chocolate2, review_number >= 50)
  2. Using the variable describing the chocolate percentage for each review, create a new column that groups chocolate percentages into one of four groups: (i) <60%, (ii) >=60 to <70%, (iii) >=70 to <90%, and (iv) >=90% (Hint: check out the substr() function in base R and the case_when() function from dplyr – see example below).
chocolate2 %>%
  mutate(
    # convert "76%" to 76 so that "100%" is not misread by its first digit
    cocoa_num = as.numeric(str_remove(cocoa_percent, "%")),
    percent_group = case_when(
      cocoa_num < 60 ~ "<60%",
      cocoa_num < 70 ~ ">=60 to <70%",
      cocoa_num < 90 ~ ">=70 to <90%",
      cocoa_num >= 90 ~ ">=90%"
    )
  ) -> chocolate2
  3. Using the new column described in #2, re-order the factor levels (if needed) to be starting with the smallest percentage group and increasing to the largest percentage group (Hint: check out the fct_relevel() function from forcats).
chocolate2$percent_group <-  factor(chocolate2$percent_group, levels = c("<60%", ">=60 to <70%", ">=70 to <90%", ">=90%"))
chocolate2 <- arrange(chocolate2, percent_group)
  4. For each country, make a set of four side-by-side boxplots plotting the groups on the x-axis and the ratings on the y-axis. These plots should be faceted by country.
qplot(x = percent_group, y = rating, data = chocolate2, geom = "boxplot", facets = . ~ country_of_bean_origin)

On average, which category of chocolate percentage is most highly rated? Do these countries mostly agree or are there disagreements?

Chocolate with a percentage of “>=60 to <70%” or “>=70 to <90%” is most highly rated. For many countries, these two categories both got the highest rating. Most countries agree with the highest rating for middle-percentage chocolate. A few countries showed particularly low ratings for the “<60%” or “>=90%” group.
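
One thing to watch when grouping by percentage is a value such as "100%", whose first character is "1". The sketch below, on a hypothetical vector of percentages, parses the whole number before binning and then orders the factor levels from smallest to largest group. The vector `cocoa_percent` here is made up and only mimics the format of the dataset column.

```r
library(dplyr)

# Hypothetical percentages, including a decimal value and the "100%" edge case
cocoa_percent <- c("55%", "60%", "72.5%", "89%", "90%", "100%")

# First-character approach: "100%" starts with "1" and would look like a low percentage
first_digit <- substr(cocoa_percent, 1, 1)

# Parse the full number instead, then bin it
cocoa_num <- as.numeric(sub("%", "", cocoa_percent, fixed = TRUE))
percent_group <- case_when(
  cocoa_num < 60 ~ "<60%",
  cocoa_num < 70 ~ ">=60 to <70%",
  cocoa_num < 90 ~ ">=70 to <90%",
  cocoa_num >= 90 ~ ">=90%"
)

# Order the levels from the smallest to the largest percentage group
percent_group <- factor(
  percent_group,
  levels = c("<60%", ">=60 to <70%", ">=70 to <90%", ">=90%")
)

data.frame(cocoa_percent, first_digit, percent_group)
```

Because `case_when()` evaluates its conditions in order, each later condition only needs an upper bound.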

Part 2: Join two datasets together

  1. Use this dataset to create a new column called continent in our chocolate dataset that contains the continent name for each review's country of bean origin.
  2. Only keep reviews from countries of bean origin with at least 10 reviews.
  3. Also, remove the country of bean origin named "Blend".
  4. Make a set of violin plots with ratings on the y-axis and continents on the x-axis.
gapminder %>%
  filter(year == 2007) %>%
  rename(country_of_bean_origin = country) %>%
  select(c("country_of_bean_origin", "continent")) -> gap2
gap3 <- read.csv(here("continent_new.csv"))
gap2 <- rbind(gap2, gap3)
chocolate3 <- left_join(chocolate, gap2, by = "country_of_bean_origin")

left_join(chocolate3, rev_num, by = "country_of_bean_origin") %>%
  filter(review_number >= 10) %>%
  filter(country_of_bean_origin != "Blend") -> chocolate3

qplot(x = continent, y = rating, data = chocolate3, geom = "violin")
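
The code above reads an extra file, continent_new.csv, and appends it to the gapminder lookup, presumably to cover origins that gapminder does not list. As a sketch of how such gaps could be found, the example below uses two small made-up tables (`reviews` and `lookup` are hypothetical) and `anti_join()` to list origins without a match.

```r
library(dplyr)

# Hypothetical review table and continent lookup
reviews <- tibble(country_of_bean_origin = c("Peru", "Fiji", "Peru", "Blend"))
lookup <- tibble(
  country_of_bean_origin = c("Peru", "Ghana"),
  continent = c("Americas", "Africa")
)

# Origins that appear in the reviews but have no continent in the lookup
missing_origins <- reviews %>%
  distinct(country_of_bean_origin) %>%
  anti_join(lookup, by = "country_of_bean_origin")

# After a left join, the same origins show up with continent == NA
joined <- left_join(reviews, lookup, by = "country_of_bean_origin")

missing_origins
joined
```

Checking for unmatched keys before plotting helps avoid an unintended NA group on the continent axis.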

Part 3: Convert wide data into long data

We are going to create a set of features for us to plot over time. Use the functions in dplyr and tidyr to perform the following steps to the chocolate dataset:

  1. Create a new set of columns titled beans, sugar, cocoa_butter, vanilla, letchin, and salt that contain a 1 or 0 representing whether or not that review for the chocolate bar contained that ingredient (1) or not (0).
  2. Create a new set of columns titled char_cocoa, char_sweet, char_nutty, char_creamy, char_roasty, char_earthy that contain a 1 or 0 representing whether or not that the most memorable characteristic for the chocolate bar had that word (1) or not (0). For example, if the word “sweet” appears in the most_memorable_characteristics, then record a 1, otherwise a 0 for that review in the char_sweet column (Hint: check out str_detect() from the stringr package).
  3. For each year (i.e. review_date), calculate the mean value in each new column you created across all reviews for that year. (Hint: If all has gone well thus far, you should have a dataset with 16 rows and 13 columns).
  4. Convert this wide dataset into a long dataset with a new feature and mean_score column.
chocolate %>%
  mutate(
    beans = ifelse(str_detect(ingredients, "B"), 1, 0),
    sugar = ifelse(str_detect(ingredients, "S"), 1, 0),
    cocoa_butter = ifelse(str_detect(ingredients, "C"), 1, 0),
    vanilla = ifelse(str_detect(ingredients, "V"), 1, 0),
    letchin = ifelse(str_detect(ingredients, "L"), 1, 0),
    salt = ifelse(str_detect(ingredients, "Sa"), 1, 0),
    char_cocoa = ifelse(str_detect(most_memorable_characteristics, "cocoa"), 1, 0),
    char_sweet = ifelse(str_detect(most_memorable_characteristics, "sweet"), 1, 0),
    char_nutty = ifelse(str_detect(most_memorable_characteristics, "nutty"), 1, 0),
    char_creamy = ifelse(str_detect(most_memorable_characteristics, "creamy"), 1, 0),
    char_roasty = ifelse(str_detect(most_memorable_characteristics, "roasty"), 1, 0),
    char_earthy = ifelse(str_detect(most_memorable_characteristics, "earthy"), 1, 0)
  ) -> chocolate4
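
One caveat worth checking, assuming the ingredient codes follow the TidyTuesday data dictionary (where `S*` is understood to mark a sweetener other than sugar and `Sa` to mark salt): the pattern "S" also matches those two codes. The sketch below uses hypothetical ingredient strings and a negative lookahead so that only a bare `S` counts as sugar. The variable names are illustrative.

```r
library(stringr)

# Hypothetical ingredient strings in the same format as the dataset column
ingredients <- c("3- B,S,C", "4- B,S*,C,Sa", "2- B,S")

# Plain "S" matches sugar, but also "S*" and "Sa"
sugar_loose <- str_detect(ingredients, "S")

# "S" not followed by "a" or "*" matches only the bare sugar code
sugar_strict <- str_detect(ingredients, "S(?![a*])")

data.frame(ingredients, sugar_loose, sugar_strict)
```

Whether the stricter pattern is the right choice depends on how the codes are defined, so it is worth confirming against the data dictionary first.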

chocolate4 %>%
  group_by(review_date) %>%
  summarize(
    beans = mean(beans, na.rm = T),
    sugar = mean(sugar, na.rm = T),
    cocoa_butter = mean(cocoa_butter, na.rm = T),
    vanilla = mean(vanilla, na.rm = T),
    letchin = mean(letchin, na.rm = T),
    salt = mean(salt, na.rm = T),
    char_cocoa = mean(char_cocoa, na.rm = T),
    char_sweet = mean(char_sweet, na.rm = T),
    char_nutty = mean(char_nutty, na.rm = T),
    char_creamy = mean(char_creamy, na.rm = T),
    char_roasty = mean(char_roasty, na.rm = T),
    char_earthy = mean(char_earthy, na.rm = T)
  ) %>%
  tibble() -> chocolate5
print(chocolate5)
## # A tibble: 16 × 13
##    review_date beans sugar cocoa_butter vanilla letchin    salt char_cocoa
##          <dbl> <dbl> <dbl>        <dbl>   <dbl>   <dbl>   <dbl>      <dbl>
##  1        2006     1 1            0.933  0.717   0.717  0           0.210 
##  2        2007     1 1            0.812  0.580   0.406  0           0.342 
##  3        2008     1 0.988        0.821  0.393   0.560  0           0.109 
##  4        2009     1 1            0.841  0.354   0.372  0           0.146 
##  5        2010     1 1            0.830  0.266   0.457  0.0106      0.218 
##  6        2011     1 1            0.739  0.170   0.170  0.0523      0.172 
##  7        2012     1 1            0.728  0.2     0.133  0.0778      0.0876
##  8        2013     1 0.989        0.802  0.215   0.305  0.0169      0.175 
##  9        2014     1 1            0.654  0.0700  0.123  0.0329      0.0607
## 10        2015     1 0.993        0.554  0.0607  0.121  0           0.127 
## 11        2016     1 0.995        0.606  0.0516  0.108  0.00939     0.0922
## 12        2017     1 1            0.573  0.0291  0.136  0.00971     0.133 
## 13        2018     1 1            0.604  0.0622  0.133  0           0.180 
## 14        2019     1 1            0.679  0.0259  0.202  0           0.259 
## 15        2020     1 1            0.568  0.0370  0.0247 0           0.284 
## 16        2021     1 0.994        0.646  0.0114  0.08   0           0.297 
## # ℹ 5 more variables: char_sweet <dbl>, char_nutty <dbl>, char_creamy <dbl>,
## #   char_roasty <dbl>, char_earthy <dbl>
chocolate5 %>%
  pivot_longer(-review_date, names_to = "feature", values_to = "mean_score") -> chocolate6
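
The yearly means and the wide-to-long step can also be written more compactly with `across()`. The sketch below runs on a tiny made-up table (`flags` is hypothetical and has only two indicator columns) to show the shape of the result.

```r
library(dplyr)
library(tidyr)

# Hypothetical 0/1 indicator columns for four reviews in two years
flags <- tibble(
  review_date = c(2006, 2006, 2007, 2007),
  sugar = c(1, 1, 1, 0),
  salt = c(0, 1, 0, 0)
)

# Mean of every indicator column per year, without naming each one separately
yearly <- flags %>%
  group_by(review_date) %>%
  summarize(across(c(sugar, salt), ~ mean(.x, na.rm = TRUE)))

# One row per (year, feature) pair
long <- yearly %>%
  pivot_longer(-review_date, names_to = "feature", values_to = "mean_score")

long
```

Each year contributes one row per feature, so two years and two features give four rows in the long table.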

Part 4: Data visualization

Use the functions in ggplot2 package to make a scatter plot of the mean_scores (y-axis) over time (x-axis). One point for each mean_score. For full credit, your plot should include:

  1. An overall title for the plot and a subtitle summarizing key trends that you found. Also include a caption in the figure with your name.
  2. Both the observed points for the mean_score, but also a smoothed non-linear pattern of the trend
  3. All plots should be shown in the one figure
  4. There should be an informative x-axis and y-axis label

Here is a common plot:

chocolate6 %>%
  ggplot(aes(review_date, mean_score)) +
  geom_point(color = "pink", size = 3, alpha = 1/2) +
  geom_smooth(formula = y~x, color = "purple", method = "loess") +
  theme_bw() +
  labs(title = "Chocolates have fewer ingredients and features as time passes", subtitle = "Lower mean scores for main ingredients and memorable characteristics of chocolates were generally observed as time passes", x = "Time", y = "Mean scores of ingredients and characteristics", caption = "Zeyu Li")

I love this colorful one more!

chocolate6 %>%
  ggplot(aes(review_date, mean_score)) +
  geom_point(aes(color = feature), size = 3, alpha = 1/3) +
  geom_smooth(formula = y~x, aes(color = feature), method = "loess", linewidth = 0.5, se = F) +
  theme_bw() +
  labs(title = "Chocolates have fewer ingredients and features as time passes", subtitle = "Lower mean scores for main ingredients and memorable characteristics of chocolates were generally observed as time passes", x = "Time", y = "Mean scores of ingredients and characteristics", caption = "Zeyu Li")
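
`geom_smooth(method = "loess")` fits a local regression with `stats::loess()`. As a rough sketch of what the smoothed line represents, the example below fits the same kind of model directly on made-up yearly means. The `trend` data frame is hypothetical and is not taken from the chocolate data.

```r
# Hypothetical yearly means with a downward trend plus a small wiggle
trend <- data.frame(review_date = 2006:2021)
trend$mean_score <- 0.7 - 0.04 * (trend$review_date - 2006) +
  0.02 * sin(trend$review_date)

# Fit a local regression (stats::loess defaults: span = 0.75, degree = 2)
fit <- loess(mean_score ~ review_date, data = trend)

# Fitted values at the observed years: this is the curve a loess smoother draws
trend$smoothed <- predict(fit)

head(trend)
```

The observed points keep the year-to-year wiggle, while the fitted values follow the overall trend.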

Part 5: Make the worst plot you can!

Using the chocolate dataset (or any of the modified versions you made throughout this assignment or anything else you wish to build upon it):

  1. Make the absolute worst plot that you can. You need to customize it in at least 7 ways to make it awful.
  2. In your document, write 1 - 2 sentences about each different customization you added (using bullets – i.e. there should be at least 7 bullet points each with 1-2 sentences), and how it could be useful for you when you want to make an awesome data visualization.
chocolate2 %>%
  ggplot(aes(review_date, rating)) +
  geom_point(size = 7, alpha = 1/10) +
  geom_smooth(formula = y~x, method = "lm", linewidth = 4, linetype = 3) +
  theme_dark() +
  facet_grid(. ~ percent_group) +
  labs(title = "Don't know what this is", subtitle = "Still no idea", x = "Don't specify what the time is of", y = "Don't specify what the ratings are for")

  1. The points are not given a bright color. Choosing bright, high-contrast colors makes points easy to see.
  2. The points are too large. An appropriate size keeps points from overlapping so that each one can be seen clearly.
  3. The points are too transparent. Alpha should not be set so low that the points become hard to see.
  4. The line is too thick. A thinner line makes the trend easier to read.
  5. The linetype is not appropriate. A dotted line is hard to follow, and an appropriate linetype makes the trend more apparent.
  6. The theme is too dark, especially since the points are black. Points and lines should contrast strongly with the background. For example, a white background would be much better.
  7. The text of the plot is not informative. A plot should include an informative overall title, a descriptive subtitle, clear x- and y-axis labels, and a caption so that readers have the context they need.
  8. The plot is not faceted appropriately. It would be better to put the facet labels on the right side and give the points and lines of each facet a different color. Alternatively, draw the points and lines in different colors in a single plot instead of faceting, so that readers can compare the trends of chocolates with different percentages.

Part 6: Make my plot a better plot!

The goal is to take my sad looking plot and make it better! If you’d like an example, here is a tweet I came across of someone who gave a talk about how to zhoosh up your ggplots.

chocolate %>%
    ggplot(aes(
        x = as.factor(review_date),
        y = rating,
        fill = review_date
    )) +
    geom_violin()

  1. You need to customize it in at least 7 ways to make it better.
  2. In your document, write 1 - 2 sentences about each different customization you added (using bullets – i.e. there should be at least 7 bullet points each with 1-2 sentences), describing how you improved it.
chocolate %>%
    ggplot(aes(
        x = review_date,
        y = rating,
        fill = as.factor(review_date)
    )) +
    geom_boxplot() +
    theme_bw(base_size = 12) +
    theme(legend.position = "none") +
    labs(title = "Which year has chocolate with higher ratings?", subtitle = "The ratings for chocolates were different across years", x = "Year of review", y = "Ratings for chocolates", caption = "L. Collado-Torres, Z. Li. 2023") -> g
    plotly::ggplotly(g)
  1. Changed the plot type to a boxplot. The main advantage of a violin plot is that it shows the probability density at different values, which is particularly useful when there are multiple peaks. But there is only one peak in most years, so a boxplot is easier to read and shows the median and quartiles directly.
  2. Changed the fill colors. The gradual blue scale of the original plot made neighboring years hard to tell apart. A distinct color for each year is clearer.
  3. Changed the theme. The original grey theme is not attractive. A white background is more appropriate, especially for a plot with so many colors.
  4. Changed the font size of the labels to make them easier to read.
  5. Removed the legend. The years are already shown on the x-axis, so the legend is not needed, and removing it leaves more space for the plot.
  6. Added a title and subtitle. A good title draws readers in and also summarizes the plot.
  7. Changed the x- and y-axis labels. Clear labels give readers more information.
  8. Made the plot interactive. Readers can see more detailed statistics by interacting with the plot.

R session information

options(width = 120)
sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.0 (2023-04-21 ucrt)
##  os       Windows 10 x64 (build 19045)
##  system   x86_64, mingw32
##  ui       RTerm
##  language (EN)
##  collate  Chinese (Simplified)_China.utf8
##  ctype    Chinese (Simplified)_China.utf8
##  tz       America/New_York
##  date     2023-09-17
##  pandoc   3.1.1 @ D:/安装/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  bslib         0.5.0   2023-06-09 [1] CRAN (R 4.3.1)
##  cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.1)
##  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.0)
##  colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.3.0)
##  crayon        1.5.2   2022-09-29 [1] CRAN (R 4.3.1)
##  crosstalk     1.2.0   2021-11-04 [1] CRAN (R 4.3.1)
##  data.table    1.14.8  2023-02-17 [1] CRAN (R 4.3.1)
##  digest        0.6.31  2022-12-11 [1] CRAN (R 4.3.1)
##  dplyr       * 1.1.2   2023-04-20 [1] CRAN (R 4.3.0)
##  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.3.1)
##  evaluate      0.21    2023-05-05 [1] CRAN (R 4.3.1)
##  fansi         1.0.4   2023-01-22 [1] CRAN (R 4.3.0)
##  farver        2.1.1   2022-07-06 [1] CRAN (R 4.3.0)
##  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.1)
##  forcats     * 1.0.0   2023-01-29 [1] CRAN (R 4.3.1)
##  gapminder   * 1.0.0   2023-03-10 [1] CRAN (R 4.3.1)
##  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
##  ggplot2     * 3.4.2   2023-04-03 [1] CRAN (R 4.3.0)
##  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.0)
##  gtable        0.3.3   2023-03-21 [1] CRAN (R 4.3.0)
##  here        * 1.0.1   2020-12-13 [1] CRAN (R 4.3.1)
##  highr         0.10    2022-12-22 [1] CRAN (R 4.3.1)
##  hms           1.1.3   2023-03-21 [1] CRAN (R 4.3.1)
##  htmltools     0.5.5   2023-03-23 [1] CRAN (R 4.3.1)
##  htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.3.1)
##  httr          1.4.6   2023-05-08 [1] CRAN (R 4.3.1)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.3.1)
##  jsonlite      1.8.5   2023-06-05 [1] CRAN (R 4.3.1)
##  knitr         1.43    2023-05-25 [1] CRAN (R 4.3.1)
##  labeling      0.4.2   2020-10-20 [1] CRAN (R 4.3.0)
##  lattice       0.21-8  2023-04-05 [2] CRAN (R 4.3.0)
##  lazyeval      0.2.2   2019-03-15 [1] CRAN (R 4.3.1)
##  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.3.0)
##  lubridate   * 1.9.2   2023-02-10 [1] CRAN (R 4.3.1)
##  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
##  Matrix        1.6-1   2023-08-14 [1] CRAN (R 4.3.1)
##  mgcv          1.8-42  2023-03-02 [2] CRAN (R 4.3.0)
##  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.3.0)
##  nlme          3.1-162 2023-01-31 [2] CRAN (R 4.3.0)
##  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
##  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
##  plotly      * 4.10.2  2023-06-03 [1] CRAN (R 4.3.1)
##  purrr       * 1.0.1   2023-01-10 [1] CRAN (R 4.3.0)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
##  readr       * 2.1.4   2023-02-10 [1] CRAN (R 4.3.1)
##  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.3.0)
##  rmarkdown     2.22    2023-06-01 [1] CRAN (R 4.3.1)
##  rprojroot     2.0.3   2022-04-02 [1] CRAN (R 4.3.1)
##  rstudioapi    0.14    2022-08-22 [1] CRAN (R 4.3.1)
##  sass          0.4.6   2023-05-03 [1] CRAN (R 4.3.1)
##  scales        1.2.1   2022-08-20 [1] CRAN (R 4.3.0)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
##  stringi       1.7.12  2023-01-11 [1] CRAN (R 4.3.0)
##  stringr     * 1.5.0   2022-12-02 [1] CRAN (R 4.3.1)
##  tibble      * 3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
##  tidyr       * 1.3.0   2023-01-24 [1] CRAN (R 4.3.0)
##  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
##  tidyverse   * 2.0.0   2023-02-22 [1] CRAN (R 4.3.1)
##  timechange    0.2.0   2023-01-11 [1] CRAN (R 4.3.1)
##  tzdb          0.4.0   2023-05-12 [1] CRAN (R 4.3.1)
##  utf8          1.2.3   2023-01-31 [1] CRAN (R 4.3.0)
##  vctrs         0.6.2   2023-04-19 [1] CRAN (R 4.3.0)
##  viridisLite   0.4.2   2023-05-02 [1] CRAN (R 4.3.0)
##  withr         2.5.0   2022-03-03 [1] CRAN (R 4.3.0)
##  xfun          0.39    2023-04-20 [1] CRAN (R 4.3.1)
##  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.0)
## 
##  [1] C:/Users/13392/AppData/Local/R/win-library/4.3
##  [2] D:/安装/R-4.3.0/library
## 
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────